Use safe_denominator to prevent division by zero. by odiazib · Pull Request #489 · eagles-project/mam4xx

odiazib · 2026-02-02T16:40:03Z

We are encountering a division-by-zero error in hetfrz. We rely on safe_denominator to prevent this. Note that PR #481 is also intended to address the same issue. We need to decide which approach to merge into master. @susburrows

kaizhangpnl

Thanks for fixing this. It looks good to me.

codecov · 2026-02-04T03:56:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.35%. Comparing base (34c4d5c) to head (92f7c2f).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #489      +/-   ##
==========================================
- Coverage   93.43%   93.35%   -0.08%     
==========================================
  Files         303      303              
  Lines       25177    24287     -890     
  Branches     2766     2766              
==========================================
- Hits        23523    22673     -850     
+ Misses       1654     1614      -40

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

susburrows

Hi @odiazib and @kaizhangpnl: I reviewed these changes and I think there may be a better approach. Can you please check my suggestion?

susburrows · 2026-02-05T01:12:05Z

src/mam4xx/hetfrz.hpp

  Real coat_ratio2 =
-      haero::max(6.0 * dr_so4_monolayers_dust * vol_core[0], 1.0e-36);
-  dstcoat[0] = coat_ratio1 / coat_ratio2;
+      haero::max(6.0 * dr_so4_monolayers_dust * vol_core[0], 0.0);


If I am reading the code correctly here, coat_ratio2 returns zero only in the case where vol_core = 0. (The other terms in this equation are constants.)

In turn, vol_core = 0 only when dmac = 0.

dmac is the dust mass air concentration. coat_ratio2 is used here to calculate the surface coating of the dust aerosol present in the air. If there is no dust aerosol (dmac=0), then it makes sense for coat_ratio2 to return NaN, because the entire calculation is physically meaningless in this condition.

Because of this, I suggest the following may be a more appropriate and efficient solution: check whether dmac exceeds a minimum threshold at line 1018; if not, skip the calculations in lines 1018-1029 entirely.

Similar reasoning can be applied to all of the subsequent code blocks.

There seems to be a bug here, since vol_core[0] is calculated as the BC core, not the dust core:

vol_core[0] = bcmpc / (specdens_bc * air_density);

In the original Fortran code, vol_core(2) and vol_core(3) are for dust core.

If vol_core[1] is used (vol_core(2) in Fortran), I think Susannah's fix is good.

It appears that mam4xx is doing the same thing as the Fortran code:

https://github.com/eagles-project/e3sm_mam4_refactor/blob/2e58607847ef2868e390c564039e8f04d648d9ac/components/eam/src/physics/cam/hetfrz_classnuc_cam.F90#L1127C2-L1130C40

Note that 1 is 0 in c++.

vol_core(1) = aer(bc_pcarbon)/(specdens_bc*rhoair) coat_ratio1 = vol_shell(1)*(r_bc*2._r8)*fac_volsfc_bc coat_ratio2 = max(6.0_r8*dr_so4_monolayers_dust*vol_core(1), 0.0_r8) dstcoat(1) = coat_ratio1/coat_ratio2

I had a double check: for dst_a1 and dst_a3, the code does use dstcoat[1] and dstcoat[2]:

mam4xx/src/mam4xx/hetfrz.hpp

Lines 1030 to 1047 in 92f7c2f

// dust_a1

coat_ratio1 = vol_shell[1] * (r_dust_a1 * 2.0) * fac_volsfc_dust_a1;

coat_ratio2 = haero::max(6.0 * dr_so4_monolayers_dust * vol_core[1], 0.0);

mult_coat_ratio2 = FloatingPoint<Real>::safe_denominator(coat_ratio2, tol);

dstcoat[1] = coat_ratio1 * mult_coat_ratio2;

// dust_a3

vol_shell[2] = so4mc / (specdens_so4 * air_density) +

pommc / (specdens_pom * air_density) +

soamc / (specdens_soa * air_density) +

mommc / (specdens_mom * air_density);

vol_core[2] = dmc / (specdens_dst * air_density);

coat_ratio1 = vol_shell[2] * (r_dust_a3 * 2.0) * fac_volsfc_dust_a3;

coat_ratio2 = haero::max(6.0 * dr_so4_monolayers_dust * vol_core[2], 0.0);

mult_coat_ratio2 = FloatingPoint<Real>::safe_denominator(coat_ratio2, tol);

dstcoat[2] = coat_ratio1 * mult_coat_ratio2;

so this part is correct.

The code seems to use dstcoat[0] (or dstcoat(1) in fortran) to represent coating on BC. This made me confused, but it seems to be correct now to me (it seems the code also assume dr_so4_monolayers_BC = dr_so4_monolayers_dust).

If we want a if-then condition, I think we should determine if vol_core[0] larger than zero directly. On the other hand, I think the current PR code is good enough.

It seems the current PR should be sufficient to stop the NaN errors (at least in this spot). However, an if-then check would allow several computations to be skipped. If this happens frequently enough (and if similar code bloack happen elsewhere), I assume this might result in some performance gains, which would be helpful.

@odiazib , do you have a sense of how much performance could potentially be gained from these types of changes?

prevent division by zero.

e815b14

odiazib requested a review from kaizhangpnl February 2, 2026 16:40

kaizhangpnl approved these changes Feb 2, 2026

View reviewed changes

save

92f7c2f

odiazib mentioned this pull request Feb 4, 2026

Division by zero error in hetftz #490

Open

susburrows reviewed Feb 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use safe_denominator to prevent division by zero.#489

Use safe_denominator to prevent division by zero.#489
odiazib wants to merge 2 commits intomainfrom
oscar/hetfrz-safe-denominator

odiazib commented Feb 2, 2026 •

edited

Loading

Uh oh!

kaizhangpnl left a comment

Uh oh!

codecov bot commented Feb 4, 2026

Uh oh!

susburrows left a comment

Uh oh!

susburrows Feb 5, 2026 •

edited

Loading

Uh oh!

kaizhangpnl Feb 5, 2026

Uh oh!

kaizhangpnl Feb 5, 2026

Uh oh!

odiazib Feb 5, 2026

Uh oh!

kaizhangpnl Feb 5, 2026 •

edited

Loading

Uh oh!

susburrows Feb 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


	// dust_a1
	coat_ratio1 = vol_shell[1] * (r_dust_a1 * 2.0) * fac_volsfc_dust_a1;
	coat_ratio2 = haero::max(6.0 * dr_so4_monolayers_dust * vol_core[1], 0.0);
	mult_coat_ratio2 = FloatingPoint<Real>::safe_denominator(coat_ratio2, tol);
	dstcoat[1] = coat_ratio1 * mult_coat_ratio2;

	// dust_a3
	vol_shell[2] = so4mc / (specdens_so4 * air_density) +
	pommc / (specdens_pom * air_density) +
	soamc / (specdens_soa * air_density) +
	mommc / (specdens_mom * air_density);

	vol_core[2] = dmc / (specdens_dst * air_density);
	coat_ratio1 = vol_shell[2] * (r_dust_a3 * 2.0) * fac_volsfc_dust_a3;
	coat_ratio2 = haero::max(6.0 * dr_so4_monolayers_dust * vol_core[2], 0.0);
	mult_coat_ratio2 = FloatingPoint<Real>::safe_denominator(coat_ratio2, tol);
	dstcoat[2] = coat_ratio1 * mult_coat_ratio2;

Conversation

odiazib commented Feb 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kaizhangpnl left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Feb 4, 2026

Codecov Report

Uh oh!

susburrows left a comment

Choose a reason for hiding this comment

Uh oh!

susburrows Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kaizhangpnl Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

kaizhangpnl Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

odiazib Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

kaizhangpnl Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

susburrows Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

odiazib commented Feb 2, 2026 •

edited

Loading

susburrows Feb 5, 2026 •

edited

Loading

kaizhangpnl Feb 5, 2026 •

edited

Loading